66 research outputs found

    Automatic extraction of robotic surgery actions from text and kinematic data

    The latest generation of robotic systems is becoming increasingly autonomous due to technological advancements and artificial intelligence. The medical field, particularly surgery, is also interested in these technologies because automation would benefit surgeons and patients. While the research community is active in this direction, commercial surgical robots do not currently operate autonomously due to the risks involved in dealing with human patients: it is still considered safer to rely on human surgeons' intelligence for decision-making. This means that robots must possess human-like intelligence, including various reasoning capabilities and extensive knowledge, to become more autonomous and credible. Indeed, as current research in the field demonstrates, one of the most critical aspects in developing autonomous systems is the acquisition and management of knowledge. In particular, a surgical robot must base its actions on solid procedural surgical knowledge to operate autonomously, safely, and expertly. This thesis investigates different possibilities for automatically extracting and managing knowledge from text and kinematic data. In the first part, we investigated the possibility of extracting procedural surgical knowledge from real intervention descriptions available in textbooks and academic papers in the robotic-surgical domain, by exploiting Transformer-based pre-trained language models. In particular, we released SurgicBERTa, a RoBERTa-based pre-trained language model for surgical literature understanding. It has been used to detect procedural sentences in books and extract procedural elements from them. Then, with some use cases, we explored the possibilities of translating written instructions into logical rules usable for robotic planning. Since not all the knowledge required for automating a procedure is written in texts, we introduce the concept of surgical commonsense, showing how it relates to different autonomy levels. In the second part of the thesis, we analyzed surgical procedures from a lower granularity level, showing how each surgical gesture is associated with a given combination of kinematic data

    Automatic detection of procedural knowledge in robotic-assisted surgical texts

    Purpose: The automatic extraction of knowledge about intervention execution from surgical manuals would be of the utmost importance to develop expert surgical systems and assistants. In this work we assess the feasibility of automatically identifying the sentences of a surgical intervention text containing procedural information, a subtask of the broader goal of extracting intervention workflows from surgical manuals. Methods: We frame the problem as a binary classification task. We first introduce a new public dataset of 1958 sentences from robotic surgery texts, manually annotated as procedural or non-procedural. We then apply different classification methods, from classical machine learning algorithms to more recent neural-network approaches and classification methods exploiting transformers (e.g., BERT, ClinicalBERT). We also analyze the benefits of applying balancing techniques to the dataset. Results: The architectures based on neural networks fed with FastText embeddings and the one based on ClinicalBERT outperform all the other tested methods, empirically confirming the feasibility of the task. Adopting balancing techniques does not lead to substantial improvements in classification. Conclusion: This is the first work experimenting with machine / deep learning algorithms for automatically identifying procedural sentences in surgical texts. It also introduces the first public dataset that can be used for benchmarking different classification methods for the task

    The Robotic Surgery Procedural Framebank

    Robot-assisted minimally invasive surgery is the gold standard for the surgical treatment of many pathological conditions, and several manuals and academic papers describe how to perform these interventions. These high-quality, often peer-reviewed texts are the main study resource for medical personnel and consequently contain essential procedural domain-specific knowledge. The procedural knowledge therein described could be extracted, e.g., on the basis of semantic parsing models, and used to develop clinical decision support systems or even automation methods for some steps of a procedure. However, natural language understanding algorithms such as semantic role labelers have lower efficacy and coverage issues when applied to domains other than those they are typically trained on (i.e., newswire text). To overcome this problem, starting from PropBank frames, we propose a new linguistic resource specific to the robotic-surgery domain, named Robotic Surgery Procedural Framebank (RSPF). We extract from robotic-surgical texts verbs and nouns that describe surgical actions and extend PropBank frames by adding the new lemmas, frames, or role sets required to cover missing lemmas, specific frames describing the surgical significance, or new semantic roles used in procedural surgical language. Our resource is publicly available and can be used to annotate corpora in the surgical domain to train and evaluate Semantic Role Labeling (SRL) systems in a challenging fine-grained domain setting

    SurgicBERTa: a pre-trained language model for procedural surgical language

    Pre-trained language models are now ubiquitous in natural language processing, being successfully applied to many different tasks and in several real-world applications. However, even though there is a wealth of high-quality written materials on surgery, and the scientific community has shown a growing interest in the application of natural language processing techniques in surgery, a pre-trained language model specific to the surgical domain is still missing. The creation and public release of such a model would serve numerous useful clinical applications. For example, it could enhance existing surgical knowledge bases employed for task automation, or assist medical students in summarizing complex surgical descriptions. For this reason, in this paper, we introduce SurgicBERTa, a pre-trained language model specific to the English surgical language, i.e., the language used in the surgical domain. SurgicBERTa has been obtained from RoBERTa through continued pre-training with the masked language modeling objective on 300k sentences taken from English surgical books and papers, for a total of 7 million words. By publicly releasing SurgicBERTa, we make available a resource built from the content collected in many high-quality surgical books, online textual resources, and academic papers. We performed several assessments in order to evaluate SurgicBERTa, comparing it with the general domain RoBERTa. First, we intrinsically assessed the model in terms of perplexity, accuracy, and evaluation loss resulting from the continual training according to the masked language modeling task. Then, we extrinsically evaluated SurgicBERTa on several downstream tasks, namely (i) procedural sentence detection, (ii) procedural knowledge extraction, (iii) ontological information discovery, and (iv) surgical terminology acquisition. Finally, we conducted some qualitative analysis on SurgicBERTa, showing that it contains a lot of surgical knowledge that could be useful to enrich existing state-of-the-art surgical knowledge bases or to extract surgical knowledge. All the assessments show that SurgicBERTa deals with surgical language better than a general-purpose pre-trained language model such as RoBERTa, and therefore can be effectively exploited in many computer-assisted applications in the surgical domain

    Machine understanding surgical actions from intervention procedure textbooks

    The automatic extraction of procedural surgical knowledge from surgery manuals, academic papers, or other high-quality textual resources is of the utmost importance to develop knowledge-based clinical decision support systems, to automatically execute some steps of a procedure, or to summarize the procedural information, spread throughout the texts, in a structured form usable as a study resource by medical students. In this work, we propose a first benchmark on extracting detailed surgical actions from available intervention procedure textbooks and papers. We frame the problem as a Semantic Role Labeling task. Exploiting a manually annotated dataset, we apply different Transformer-based information extraction methods. Starting from RoBERTa and BioMedRoBERTa pre-trained language models, we first investigate a zero-shot scenario and compare the obtained results with a full fine-tuning setting. We then introduce a new ad-hoc surgical language model, named SurgicBERTa, pre-trained on a large collection of surgical materials, and we compare it with the previous ones. In the assessment, we explore different dataset splits (one in-domain and two out-of-domain) and we also investigate the effectiveness of the approach in a few-shot learning scenario. Performance is evaluated on three correlated sub-tasks: predicate disambiguation, semantic argument disambiguation and predicate-argument disambiguation. Results show that the fine-tuning of a pre-trained domain-specific language model achieves the highest performance on all splits and on all sub-tasks. All models are publicly released

    Mapping natural language procedures descriptions to linear temporal logic templates: an application in the surgical robotic domain

    Natural language annotations and manuals can provide useful procedural information and relations for the highly specialized scenario of autonomous robotic task planning. In this paper, we propose and publicly release AUTOMATE, a pipeline for automatic task knowledge extraction from expert-written domain texts. AUTOMATE integrates semantic sentence classification, semantic role labeling, and identification of procedural connectors, in order to extract templates of Linear Temporal Logic (LTL) relations that can be directly implemented in any sufficiently expressive logic programming formalism for autonomous reasoning, assuming some low-level commonsense and domain-independent knowledge is available. This is the first work that bridges natural language descriptions of complex LTL relations and the automation of full robotic tasks. Unlike most recent similar works that assume strict language constraints in substantially simplified domains, we test our pipeline on texts that reflect the expressiveness of natural language used in available textbooks and manuals. In fact, we test AUTOMATE in the surgical robotic scenario, defining realistic language constraints based on a publicly available dataset. In the context of two benchmark training tasks with texts constrained as above, we show that automatically extracted LTL templates, after translation to a suitable logic programming paradigm, achieve comparable planning success in reduced time, with respect to logic programs written by expert programmers

    Reading as an Enabling Technology: Informing Surgical Robots with Spatial Information

    The two key challenges in mining surgical textbooks for executable information are extracting structured high-level information from texts written in natural language and presenting the information thus extracted to the systems modules that need it, in a format that is suitable for use. This contribution focuses on the latter challenge, a key integration step toward cognitive surgical robotic

    Inductive learning of surgical task knowledge from intra-operative expert feedback

    Knowledge-based and particularly logic-based systems for task planning and execution guarantee trustability and safety of robotic systems interacting with humans. However, domain knowledge is usually incomplete. This paper proposes a novel framework for task knowledge refinement from real-time user feedback, based on inductive logic programming

    A Framework for the Design and Simulation of Embedded Vision Applications Based on OpenVX and ROS

    Customizing computer vision applications for embedded systems is a common and widespread problem in the cyber-physical systems community. Such customization means parametrizing the algorithm by considering the external environment and mapping the software application to the heterogeneous hardware resources while satisfying non-functional constraints like performance, power, and energy consumption. This work presents a framework for the design and simulation of embedded vision applications that integrates the OpenVX standard platform with the Robot Operating System (ROS). The paper shows how the framework has been applied to tune the ORB-SLAM application for an NVIDIA Jetson TX2 board by considering different environment contexts and different design constraints

    ZusammenQA: Data Augmentation with Specialized Models for Cross-lingual Open-retrieval Question Answering System

    This paper introduces our proposed system for the MIA Shared Task on Cross-lingual Open-retrieval Question Answering (COQA). In this challenging scenario, given an input question the system has to gather evidence documents from a multilingual pool and generate from them an answer in the language of the question. We devised several approaches combining different model variants for three main components: Data Augmentation, Passage Retrieval, and Answer Generation. For passage retrieval, we evaluated the monolingual BM25 ranker against the ensemble of re-rankers based on multilingual pretrained language models (PLMs) and also variants of the shared task baseline, re-training it from scratch using a recently introduced contrastive loss that maintains a strong gradient signal throughout training by means of mixed negative samples. For answer generation, we focused on language- and domain-specialization by means of continued language model (LM) pretraining of existing multilingual encoders. Additionally, for both passage retrieval and answer generation, we augmented the training data provided by the task organizers with automatically generated question-answer pairs created from Wikipedia passages to mitigate the issue of data scarcity, particularly for the low-resource languages for which no training data were provided. Our results show that language- and domain-specialization as well as data augmentation help, especially for low-resource languages